Grammar-based tools for the creation of tagging resources for an unresourced language: the case of Northern Sotho

Authors

  • Ulrich Heid
  • Elsabé Taljard
  • Danie J. Prinsloo
Abstract

We describe an architecture for the parallel construction of a tagger lexicon and an annotated reference corpus for part-of-speech tagging of Northern Sotho, a Bantu language of South Africa for which no tagged resources have been available so far. Our tools exploit morphological and syntactic properties of the language. We use symbolic pretagging followed by stochastic tagging, an architecture that proves useful not only for bootstrapping tagging resources but also for tagging any new text. We discuss the tagset design, the tool architecture, and the current state of our ongoing effort.
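The two-stage idea in the abstract (symbolic pretagging, then stochastic tagging) can be illustrated with a minimal sketch. This is not the authors' tool: the lexicon entries, the tag names (`N`, `V`, `CS`, `PRO`), the placeholder tokens, and the greedy count-based scoring are all assumptions made for illustration, and none of the tokens are real Northern Sotho data.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon: token -> candidate tags (placeholder data only).
LEXICON = {
    "tok_a": {"N"},
    "tok_b": {"V"},
    "tok_c": {"CS", "PRO"},  # an ambiguous function word
}


def pretag(tokens, lexicon):
    """Symbolic step: attach the lexicon's candidate tags; unknown tokens get None."""
    return [(tok, lexicon.get(tok)) for tok in tokens]


def train(tagged_sentences):
    """Count tag-to-tag transitions and word/tag pairs in a small reference corpus."""
    trans = defaultdict(Counter)
    emit = defaultdict(Counter)
    for sentence in tagged_sentences:
        prev = "<s>"
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[word][tag] += 1
            prev = tag
    return trans, emit


def disambiguate(pretagged, trans, emit, all_tags):
    """Stochastic step: greedily pick the candidate with the best add-one count score."""
    prev = "<s>"
    result = []
    for word, candidates in pretagged:
        # Unknown tokens fall back to the full tagset.
        options = sorted(candidates) if candidates else sorted(all_tags)
        best = max(options, key=lambda t: (trans[prev][t] + 1) * (emit[word][t] + 1))
        result.append((word, best))
        prev = best
    return result


# Hypothetical hand-corrected reference corpus (two toy sentences).
CORPUS = [
    [("tok_a", "N"), ("tok_c", "CS"), ("tok_b", "V")],
    [("tok_c", "PRO"), ("tok_b", "V")],
]
TRANS, EMIT = train(CORPUS)
ALL_TAGS = {"N", "V", "CS", "PRO"}
```

In this sketch the symbolic step narrows each token to the tags its lexicon entry allows, so the statistical step only has to choose among the remaining candidates. A real system would likely replace the greedy choice with a trained tagger, but the division of labour is the same.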

Publication date: 2006